ensemble model
An Improved Ensemble-Based Machine Learning Model with Feature Optimization for Early Diabetes Prediction
Islam, Md. Najmul, Rimon, Md. Miner Hossain, Shamim, Shah Sadek-E-Akbor, Fahad, Zarif Mohaimen, Mony, Md. Jehadul Islam, Chowdhury, Md. Jalal Uddin
Diabetes is a serious worldwide health issue, and successful intervention depends on early detection. However, overlapping risk factors and imbalanced data make prediction difficult. This study uses extensive health survey data to create a machine learning framework for diabetes classification that is both accurate and comprehensible, with results intended to aid clinical decision-making. Using the BRFSS dataset, we assessed a number of supervised learning techniques. SMOTE and Tomek Links were used to correct class imbalance. To improve prediction performance, both individual models and ensemble techniques such as stacking were investigated. The study uses the 2015 BRFSS dataset, which includes roughly 253,680 records with 22 numerical features. The individual models Random Forest, XGBoost, CatBoost, and LightGBM attained strong ROC-AUC performance of approximately 0.96. The stacking ensemble with XGBoost and KNN yielded the best overall results, with 94.82% accuracy, ROC-AUC of 0.989, and PR-AUC of 0.991, indicating a favourable balance between recall and precision. We also proposed and developed a React Native-based application with a Python Flask backend to support early diabetes prediction, providing users with an accessible and efficient health monitoring tool.
- Europe > United Kingdom > England > Greater London > London (0.04)
- Asia > Bangladesh > Sylhet Division > Sylhet District > Sylhet (0.04)
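The SMOTE step mentioned in the diabetes-prediction abstract above can be illustrated with a minimal sketch. This is not the authors' code: it uses only the Python standard library, the function name `smote_sketch`, the toy feature values, and the choice `k=3` are assumptions for illustration, and the Tomek Links cleaning step is not shown. The core idea is to create synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority points (hypothetical helper).

    Each synthetic point lies on the line segment between a randomly
    chosen minority point and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority class with two made-up numerical features.
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote_sketch(minority, n_new=6)
```

Because every synthetic point is a convex combination of two real minority points, the new samples stay inside the region already occupied by the minority class rather than duplicating existing rows.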
Optimizing Stroke Risk Prediction: A Machine Learning Pipeline Combining ROS-Balanced Ensembles and XAI
Akib, A S M Ahsanul Sarkar, Khawla, Raduana, Hasib, Abdul
Stroke is a major cause of death and permanent impairment, making it a significant worldwide health concern. Early risk assessment is essential for prompt intervention and successful prevention. To address this challenge, we used ensemble modeling and explainable AI (XAI) techniques to create an interpretable machine learning framework for stroke risk prediction. Our approach combined feature engineering and data preprocessing, including Random Over-Sampling (ROS) to address class imbalance, with a thorough evaluation of 10 different machine learning models using 5-fold cross-validation across several datasets. Our optimized ensemble model (Random Forest + ExtraTrees + XGBoost) obtained 99.09% accuracy on the Stroke Prediction Dataset (SPD). Using LIME-based interpretability analysis, we identified three important clinical variables (age, hypertension, and glucose levels), which improves the model's transparency and clinical applicability. This study highlights how combining ensemble learning with XAI can deliver highly accurate and interpretable early stroke risk assessment. By enabling data-driven prevention and personalized clinical decisions, our framework has the potential to transform stroke prediction and cardiovascular risk management.
- Asia > Singapore (0.04)
- Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.04)
- Europe > Switzerland (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.87)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
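Random Over-Sampling (ROS), named in the stroke-prediction abstract above, can be sketched in a few lines. This is an illustrative standard-library version under stated assumptions, not the authors' pipeline: the helper name `random_over_sample` and the toy rows (age, hypertension flag) are hypothetical. ROS duplicates randomly chosen minority-class rows until every class matches the size of the largest class.

```python
import random
from collections import Counter


def random_over_sample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until classes are balanced."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())  # size of the largest class
    out_rows, out_labels = list(rows), list(labels)
    for cls, count in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Draw with replacement from this class until it reaches the target.
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Toy data: [age, hypertension]; label 1 is the minority (stroke) class.
rows = [[67, 1], [45, 0], [52, 0], [38, 0], [71, 1], [49, 0]]
labels = [1, 0, 0, 0, 1, 0]
bal_rows, bal_labels = random_over_sample(rows, labels)
```

When ROS is combined with cross-validation, it is generally applied to the training folds only, since duplicated rows that leak into a validation fold can inflate the measured accuracy.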
Re(Visiting) Time Series Foundation Models in Finance
Rahimikia, Eghbal, Ni, Hao, Wang, Weiguan
Financial time series forecasting is central to trading, portfolio optimization, and risk management, yet it remains challenging due to noisy, non-stationary, and heterogeneous data. Recent advances in time series foundation models (TSFMs), inspired by large language models, offer a new paradigm for learning generalizable temporal representations from large and diverse datasets. This paper presents the first comprehensive empirical study of TSFMs in global financial markets. Using a large-scale dataset of daily excess returns across diverse markets, we evaluate zero-shot inference, fine-tuning, and pre-training from scratch against strong benchmark models. We find that off-the-shelf pre-trained TSFMs perform poorly in zero-shot and fine-tuning settings, whereas models pre-trained from scratch on financial data achieve substantial forecasting and economic improvements, underscoring the value of domain-specific adaptation. Increasing the dataset size, incorporating synthetic data augmentation, and applying hyperparameter tuning further enhance performance.
- Europe > United Kingdom (0.14)
- North America > Canada > Quebec > Montreal (0.13)
- Europe > Germany (0.04)
- Information Technology (1.00)
- Banking & Finance > Trading (1.00)
78211247db84d96acf4e00092a7fba80-AuthorFeedback.pdf
From the feature space's perspective, we can assume that We add several experiments using random-color triggers as shown in Figure 1. CIFAR-100 (Figure 1(b), random target class) to show the marginal effect of dataset and target class choices. Regarding Reviewer #4's concern about the size of the support set, the choice of black-white and colorful triggers The only prior knowledge is the 3×3 trigger size. Comparison to related works about model ensembling (Reviewer #5). The model ensembling in this work has a completely different motivation.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Oceania > Australia (0.04)
- North America > United States > New Jersey > Hudson County > Hoboken (0.04)
- Information Technology > Modeling & Simulation (1.00)
- Information Technology > Data Science > Data Mining (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- Europe > Sweden > Stockholm > Stockholm (0.04)
- North America > United States > Virginia (0.04)
- Asia > China (0.04)
- North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
- Information Technology > Security & Privacy (0.93)
- Education (0.68)
A Implementation Details A.1 CIFAR-10 ResNet-29
For all the experimental results on ResNet-29 v2 (He et al., 2016b), we use a batch size The network is trained with the Adam optimizer (Kingma et al., 2015) for 200 epochs. We randomly split the training dataset into 45,000 training images and 5,000 validation images. We train a Wide ResNet-28-10 v2 (Zagoruyko & Komodakis, 2016) to obtain the state-of-the-art accuracy for CIFAR-10 (e.g., Table 2 in the main text). For mixup (Zhang et al., 2018; Thulasidasan et al., 2019), the mixing parameter of two images is For CCAT (Stutz et al., 2020), we observe that training models with adversarial examples bounded We train a Wide ResNet-28-10 v2 (Zagoruyko & Komodakis, 2016) to obtain the state-of-the-art accuracy for CIFAR-100. All the experiments on ImageNet were obtained via training a ResNet-101 v1 (He et al., 2016a) following the training script at The input image is normalized (divided by 255) to be within [0,1].
- North America > United States > Virginia (0.04)
- Asia > China > Hong Kong (0.04)
- Europe > Austria (0.04)
- Asia > China > Guangdong Province > Shenzhen (0.04)
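Two preprocessing steps from the Implementation Details excerpt above, the random split into 45,000 training and 5,000 validation images and the division of pixel values by 255, can be sketched as follows. This is a standard-library illustration only: the function names, the seed, and the assumption that the training set holds 50,000 images (the sum of the two stated sizes) are mine, not taken from the authors' scripts.

```python
import random


def split_indices(n_total=50000, n_val=5000, seed=0):
    """Shuffle image indices and hold out n_val of them for validation."""
    rng = random.Random(seed)
    idx = list(range(n_total))
    rng.shuffle(idx)
    return idx[n_val:], idx[:n_val]  # (training indices, validation indices)


def normalize_pixels(pixels):
    """Map 8-bit pixel values (0..255) into the range [0, 1]."""
    return [p / 255 for p in pixels]


train_idx, val_idx = split_indices()
scaled = normalize_pixels([0, 128, 255])
```

Fixing the seed keeps the validation set identical across runs, so models trained with different settings are compared on the same held-out images.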